refact: refactor format backends into dpdata.formats by njzjz-bot · Pull Request #970 · deepmodeling/dpdata

njzjz-bot · 2026-05-05T07:43:26Z

This PR fixes the CI failures from #946 after moving implementation modules.

Changes:

- Keep real format backends under dpdata.formats.
- Move the non-format MD analysis/tool modules back to dpdata.md instead of dpdata.formats.md.
- Do not preserve dpdata.lammps / dpdata.vasp as top-level exports.
- Add explicit package exports for the newly moved format subpackages under dpdata.formats.
- Update direct helper imports in tests/internal code to their new locations:
  - dpdata.formats.cp2k.cell.cell_to_low_triangle
  - dpdata.formats.gaussian.gjf.detect_multiplicity
  - dpdata.formats.qe.traj.convert_celldm
  - dpdata.formats.amber.md.cell_lengths_angles_to_cell
  - dpdata.md.msd.msd
  - dpdata.md.water.*
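
For downstream code that imported helpers directly, the path change can be sketched as a small string rewrite. This is a hypothetical helper, not part of dpdata: `MOVED_FORMATS` and `new_import_path` are names invented here, and the sketch assumes the old paths had the shape `dpdata.<format>.<module>` while `dpdata.md` and `dpdata.plugins` stay where they are.

```python
# Format subpackages named in this PR discussion as living under dpdata.formats.
# "md" is deliberately absent: the MD analysis modules stay at dpdata.md.
MOVED_FORMATS = frozenset(
    {
        "abacus", "amber", "cp2k", "deepmd", "dftbplus", "fhi_aims",
        "gaussian", "gromacs", "lammps", "lmdb", "openmx", "orca", "psi4",
        "pwmat", "pymatgen", "qe", "rdkit", "siesta", "vasp", "xyz",
    }
)


def new_import_path(old_path: str) -> str:
    """Rewrite an assumed old dotted path to its location under dpdata.formats."""
    parts = old_path.split(".")
    if len(parts) >= 2 and parts[0] == "dpdata" and parts[1] in MOVED_FORMATS:
        return ".".join(["dpdata", "formats", *parts[1:]])
    # Anything else (dpdata.md, dpdata.plugins, already-migrated paths) is unchanged.
    return old_path


print(new_import_path("dpdata.cp2k.cell.cell_to_low_triangle"))
print(new_import_path("dpdata.md.msd.msd"))
```

Paths that already start with `dpdata.formats` pass through untouched, so the helper can be applied repeatedly without double-prefixing.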

Follow-up:

Removed the legacy dpdata.<format> wrapper modules that were added earlier; this branch no longer keeps those old import paths alive.

Local checks:

cd tests && uv run pytest test_amber_md.py test_cell_to_low_triangle.py test_gaussian_driver.py::TestMakeGaussian::test_detect_multiplicity test_qe_cp_traj.py::TestConverCellDim test_msd.py test_water_ions.py -q → 45 passed
uv run pyright → currently reports 2 pre-existing missing _version / __version__ diagnostics in dpdata/__init__.py and dpdata/cli.py
git grep -n "from dpdata\.formats\..* import \*\|legacy" -- dpdata tests → no matches
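
The `git grep` check above can also be expressed in Python, for example inside a test. This is a minimal sketch: `PATTERN` is a direct translation of the grep expression, while `find_matches` and the sample text are invented for illustration.

```python
import re

# Same alternation as the grep pattern: a wildcard import from any
# dpdata.formats submodule, or the word "legacy" anywhere on the line.
PATTERN = re.compile(r"from dpdata\.formats\..* import \*|legacy")


def find_matches(text: str) -> list[tuple[int, str]]:
    """Return (line number, line) pairs that the pattern flags."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if PATTERN.search(line)
    ]


sample = (
    "from dpdata.formats.vasp import *\n"
    "from dpdata.formats.vasp.poscar import to_system_data\n"
    "# legacy shim\n"
)
for number, line in find_matches(sample):
    print(number, line)
```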

Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)

Summary by CodeRabbit

Release Notes

Refactor
- Reorganized internal format handler modules into a dedicated formats subdirectory for improved code structure and maintainability.
- Updated internal import paths throughout the codebase to reflect the new module organization structure.

codspeed-hq · 2026-05-05T07:44:49Z

Merging this PR will improve performance by 22.65%

⚠️

Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚠️

Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

⚡ 2 improved benchmarks

Performance Changes

	Mode	Benchmark	`BASE`	`HEAD`	Efficiency
⚡	WallTime	`test_import`	11.1 ms	9.6 ms	+15.37%
⚡	WallTime	`test_cli`	369.9 ms	301.6 ms	+22.65%

Comparing njzjz-bot:oc-fix-pr-946-ci (d6252fc) with master (6cdc360)
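
The Efficiency column looks consistent with `base / head - 1`. That formula is an assumption inferred from the numbers, not something the report states. A quick check under that assumption:

```python
def speedup_percent(base_ms: float, head_ms: float) -> float:
    """Relative improvement of HEAD over BASE, assuming efficiency = base/head - 1."""
    return (base_ms / head_ms - 1.0) * 100.0


cli = speedup_percent(369.9, 301.6)     # test_cli row
imported = speedup_percent(11.1, 9.6)   # test_import row

print(f"test_cli:    {cli:.2f}%")
print(f"test_import: {imported:.2f}%")
```

The `test_cli` row reproduces the reported 22.65%. The `test_import` row comes out slightly above the reported 15.37%, presumably because the displayed timings are rounded to one decimal place.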

codecov · 2026-05-05T07:45:14Z

Codecov Report

❌ Patch coverage is 98.27586% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.75%. Comparing base (6cdc360) to head (d6252fc).

Files with missing lines	Patch %	Lines
dpdata/bond_order_system.py	75.00%	1 Missing ⚠️
dpdata/plugins/amber.py	85.71%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #970      +/-   ##
==========================================
+ Coverage   86.73%   86.75%   +0.01%     
==========================================
  Files          86       89       +3     
  Lines        8084     8093       +9     
==========================================
+ Hits         7012     7021       +9     
  Misses       1072     1072
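
The percentages in this report can be rechecked from the hit and line counts. In this sketch the figure of 116 changed lines is inferred from "2 lines missing" at 98.27586% and is not stated in the report.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage of covered lines."""
    return 100.0 * hits / lines


base = coverage_percent(7012, 8084)   # master
head = coverage_percent(7021, 8093)   # this PR

# Assumption: 116 changed lines, 2 of them uncovered.
patch = coverage_percent(116 - 2, 116)

print(f"base  {base:.3f}%")
print(f"head  {head:.3f}%")
print(f"delta {head - base:+.3f} points")
print(f"patch {patch:.5f}%")
```

The computed base is marginally above the 86.73% shown in the table, which suggests the report truncates rather than rounds.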


coderabbitai · 2026-05-05T07:49:37Z

Walkthrough

This PR centralizes format-related modules under a new dpdata.formats package and updates import paths across the codebase to reference dpdata.formats.*. Package initializers for cp2k, gaussian, and qe were added and many relative imports adjusted for the deeper package nesting.

Changes

Module reorganization into dpdata.formats (single cohesive DAG)

Layer / File(s)	Summary
Package initializers & top-level `dpdata/formats/__init__.py`, `dpdata/formats/cp2k/__init__.py`, `dpdata/formats/gaussian/__init__.py`, `dpdata/formats/qe/__init__.py`, `dpdata/__init__.py`	Added formats package comment and created cp2k/gaussian/qe package inits (with `__all__`); `dpdata/__init__.py` now re-exports only the intended top-level names (keeps `md`, `System`, `LabeledSystem`, `MultiSystems`, `BondOrderSystem`, `__version__`).
Format internal imports `dpdata/formats/...` (abacus/scf.py, abacus/stru.py, gaussian/fchk.py, gaussian/log.py, gromacs/gro.py, openmx/omx.py, pwmat/atomconfig.py, pwmat/movement.py, qe/traj.py, cp2k/output.py, ...)	Updated relative imports inside format modules to account for deeper nesting (e.g., `..unit` → `...unit`, `..periodic_table` → `...periodic_table`).
Core system / bond-order wiring `dpdata/bond_order_system.py`, `dpdata/system.py`	Switched RDKit, PBC and Amber mask utility imports to `dpdata.formats.*` and updated corresponding internal calls (e.g., `system_data_to_mol`, `mol_to_system_data`, `dir_coord`).
Plugin wiring (imports & call sites) `dpdata/plugins/*` (abacus, amber, cp2k, deepmd, dftbplus, fhi_aims, gaussian, gromacs, lammps, lmdb, openmx, orca, psi4, pwmat, pymatgen, qe, rdkit, siesta, vasp, xyz, 3dmol, ...)	Repointed plugin imports and format helper calls from `dpdata.<format>` to `dpdata.formats.<format>` and adjusted call targets accordingly (no signature or control-flow changes).
Large-format adapters & misc `dpdata/plugins/deepmd.py`, `dpdata/plugins/lmdb.py`, `dpdata/formats/...`	DeepMD and HDF5 handling now route through `dpdata.formats.deepmd.*`; LMDB import updated to `dpdata.formats.lmdb.format`; related adapters updated similarly.
Tests updated `tests/*` (context.py, test_abacus_stru_dump.py, test_lammps_lmp_dump.py, test_lammps_spin.py, test_lmdb.py, test_cell_to_low_triangle.py, test_gaussian_driver.py, test_msd.py, test_qe_cp_traj.py, test_water_ions.py, ...)	Updated test imports and call sites to use `dpdata.formats.*` equivalents; test logic and assertions remain unchanged.
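
The extra dot in the relative imports follows from the extra package level. The standard library's `importlib.util.resolve_name` shows the effect without importing anything. The package names here are illustrative: `dpdata.gaussian` is assumed to be the pre-move location of a format module.

```python
from importlib.util import resolve_name

OLD_PACKAGE = "dpdata.gaussian"          # assumed package before the move
NEW_PACKAGE = "dpdata.formats.gaussian"  # package after the move

# Before the move, two dots reach the top-level dpdata package.
before = resolve_name("..unit", OLD_PACKAGE)

# After the move, the same two dots stop one level short.
stale = resolve_name("..unit", NEW_PACKAGE)

# Three dots climb past dpdata.formats back to dpdata.
fixed = resolve_name("...unit", NEW_PACKAGE)

print(before)
print(stale)
print(fixed)
```

Left unchanged, `..unit` would point at a `dpdata.formats.unit` module, which is why every moved module needed its relative imports deepened by one level.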

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

deepmodeling/dpdata#928: Related LMDB format additions and prior import paths for LMDBFormat.
deepmodeling/dpdata#967: Related OpenMX parser work that touches the same modules.

Suggested labels

size:XL, lgtm

Suggested reviewers

njzjz

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 37.84% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main change: refactoring format backends into the dpdata.formats package, which is the primary objective across all modified files.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

dpdata/bond_order_system.py (1)

81-91: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

data-only initialization still raises ValueError due to branch structure

When data is provided without file_name/rdkit_mol, Line 81 initializes from data, but control still reaches the else at Line 91 and raises. This breaks the documented data init path.

Proposed fix

-        if data:
+        if data is not None:
             mol = dpdata.formats.rdkit.utils.system_data_to_mol(data)
             self.from_rdkit_mol(mol)
-        if file_name:
+        elif file_name:
             self.from_fmt(
                 file_name, fmt, type_map=type_map, begin=begin, step=step, **kwargs
             )
-        elif rdkit_mol:
+        elif rdkit_mol is not None:
             self.from_rdkit_mol(rdkit_mol)
         else:
             raise ValueError("Please specify a mol/sdf file or a rdkit Mol object")
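
The control-flow problem can be reproduced without RDKit. In this sketch `init_buggy` and `init_fixed` are stand-ins invented for illustration: they mirror only the branch layout of the constructor and return a label instead of loading anything.

```python
ERROR = "Please specify a mol/sdf file or a rdkit Mol object"


def init_buggy(data=None, file_name=None, rdkit_mol=None):
    """Original layout: a lone `if`, then a separate if/elif/else chain."""
    loaded = []
    if data:
        loaded.append("data")
    if file_name:
        loaded.append("file")
    elif rdkit_mol:
        loaded.append("rdkit_mol")
    else:
        # Reached even when `data` was handled above.
        raise ValueError(ERROR)
    return loaded


def init_fixed(data=None, file_name=None, rdkit_mol=None):
    """Proposed layout: one chain, so exactly one branch runs."""
    if data is not None:
        return ["data"]
    elif file_name:
        return ["file"]
    elif rdkit_mol is not None:
        return ["rdkit_mol"]
    raise ValueError(ERROR)


try:
    init_buggy(data={"atom_names": ["C"]})
except ValueError as exc:
    print("buggy:", exc)

print("fixed:", init_fixed(data={"atom_names": ["C"]}))
```

The switch from `if data:` to `if data is not None:` also matters on its own, since an empty but valid container would otherwise be treated as missing.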

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dpdata/bond_order_system.py` around lines 81 - 91, The init path processes
`data` but then continues into the file/rdkit branches and hits the final else,
causing the erroneous ValueError; update the conditional flow in the
BondOrderSystem initializer (or the method handling construction) so that the
`data` case stops further branching—either change the `if data:` block to `if
data: ... elif file_name: ... elif rdkit_mol: ... else: ...` or keep `if data:`
and add an immediate return after calling from_rdkit_mol; ensure you reference
the existing calls to dpdata.formats.rdkit.utils.system_data_to_mol,
self.from_rdkit_mol, and self.from_fmt when making the change.

dpdata/plugins/pymatgen.py (1)

72-72: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add missing import dpdata.system to fix runtime AttributeError.

Line 72 uses dpdata.system.remove_pbc(data), but dpdata.system is not imported. The dpdata package does not re-export the system module in its __init__.py—only individual classes from it are exposed. This will raise AttributeError at runtime when to_system is called on a PyMatgenMoleculeFormat instance.
Proposed fix — add explicit import
 import dpdata.formats.pymatgen.molecule
 import dpdata.formats.pymatgen.structure
+import dpdata.system
 from dpdata.format import Format
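
Whether the attribute lookup actually fails depends on whether something has already imported the submodule: Python binds a submodule as an attribute of its parent package as soon as it is loaded by any means, including a `from .system import ...` inside the package's own `__init__.py`. The sketch below builds two throwaway packages (`demo_bare` and `demo_reexport`, hypothetical names) to show both cases. It does not verify how dpdata itself behaves.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_package(root: Path, name: str, init_source: str) -> None:
    """Create a package with a `system` submodule and the given __init__ body."""
    pkg = root / name
    pkg.mkdir()
    (pkg / "__init__.py").write_text(init_source)
    (pkg / "system.py").write_text("def remove_pbc(data):\n    return data\n")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    make_package(root, "demo_bare", "")
    make_package(root, "demo_reexport", "from .system import remove_pbc\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import demo_bare

        # Nothing has loaded demo_bare.system yet, so the attribute is absent.
        bare_before = hasattr(demo_bare, "system")

        import demo_bare.system

        # The explicit import binds the submodule on the parent package.
        bare_after = hasattr(demo_bare, "system")

        import demo_reexport

        # Re-exporting a name from the submodule loads it as a side effect.
        reexport_has_system = hasattr(demo_reexport, "system")
    finally:
        sys.path.remove(tmp)

print(bare_before, bare_after, reexport_has_system)
```

Either way, the explicit `import dpdata.system` is the robust choice, because the plugin then no longer relies on another module having loaded the submodule first.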
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dpdata/plugins/pymatgen.py` at line 72, The call data =
dpdata.system.remove_pbc(data) fails at runtime because the dpdata.system
submodule isn't imported; add an explicit import (e.g., import dpdata.system) at
the top of the file and then leave the call in PyMatgenMoleculeFormat.to_system
as-is so dpdata.system.remove_pbc(data) resolves correctly.

🧹 Nitpick comments (1)

dpdata/plugins/xyz.py (1)
11-14: ⚡ Quick win

Runtime imports placed after the if TYPE_CHECKING: block — invert the order.

Lines 13–14 are unconditional runtime imports but appear after the if TYPE_CHECKING: guard. The conventional (and ruff/isort-expected) layout places the if TYPE_CHECKING: block last among all imports. This ordering may trigger an I001 violation depending on the project's ruff configuration.
♻️ Proposed fix
+from dpdata.formats.xyz.quip_gap_xyz import QuipGapxyzSystems, format_single_frame
+from dpdata.formats.xyz.xyz import coord_to_xyz, xyz_to_coord
+
 if TYPE_CHECKING:
     from dpdata.utils import FileType
-from dpdata.formats.xyz.quip_gap_xyz import QuipGapxyzSystems, format_single_frame
-from dpdata.formats.xyz.xyz import coord_to_xyz, xyz_to_coord
As per coding guidelines, dpdata/**/*.py files must pass ruff check dpdata/ before committing.
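
A minimal, self-contained illustration of the suggested layout, with runtime imports first and the `TYPE_CHECKING` block last. The function `dump_sorted` and the chosen imports are invented for this example and are unrelated to `dpdata/plugins/xyz.py`.

```python
import json
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by static type checkers; skipped at runtime.
    from collections.abc import Mapping


def dump_sorted(data: "Mapping[str, int]") -> str:
    """Serialize a mapping with sorted keys; the annotation is a string,
    so Mapping does not need to exist at runtime."""
    return json.dumps(dict(data), sort_keys=True)


print(dump_sorted({"b": 2, "a": 1}))
```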
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dpdata/plugins/xyz.py` around lines 11 - 14, The import ordering is wrong:
move the TYPE_CHECKING block so it comes after the runtime imports (i.e., place
the "if TYPE_CHECKING: from dpdata.utils import FileType" block below the
imports of QuipGapxyzSystems, format_single_frame, coord_to_xyz, and
xyz_to_coord) to satisfy ruff/isort expectations and avoid I001; ensure the
runtime symbols QuipGapxyzSystems, format_single_frame, coord_to_xyz, and
xyz_to_coord remain imported unconditionally and only FileType is guarded by
TYPE_CHECKING.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@dpdata/__init__.py`:
- Line 4: Run ruff check and fix the reported lint issues: sort the __all__
lists in dpdata/__init__ and dpdata/formats/deepmd/hdf5.py, replace mutable
default args/attributes in functions/classes (e.g., in ase_calculator.py,
rdf.py, water.py) with None and set defaults inside the function or __init__,
add explicit stacklevel=2 to all warnings.warn calls, rename variables shadowing
builtins (e.g., in driver.py, hdf5.py and pwmat-related files), remove or use
unused loop control variables (abacus, cp2k, fhi_aims, gromacs, lammps, md,
openmx, pwmat modules) or replace with _ if intentionally unused, add strict=...
to zip() calls, and address the remaining RUF/B/BLE/etc. issues (unused
unpacking, unnecessary conversions/concatenations, empty abstract methods,
assert False, missing shebangs, ambiguous names, overly broad exception
handlers) as indicated by ruff to make the codebase clean.

In `@dpdata/plugins/openmx.py`:
- Line 64: The unpacked variable `cs` from the call to
dpdata.formats.openmx.omx.to_system_data(fname, mdname) is unused and causes a
Ruff RUF059 lint error; update the unpack to either capture the unused value as
`_` (e.g., `data, _ = ...`) or assign only `data` (e.g., `data = ...`) inside
openmx.py where the call occurs, and then run ruff check dpdata/ and ruff format
dpdata/ to ensure linting/formatting compliance.



ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9d56fd68-bd9e-45f8-b025-b4160b3f00b9

📥 Commits

Reviewing files that changed from the base of the PR and between 99ea3bb and a3ddf7f.

📒 Files selected for processing (99)

dpdata/__init__.py
dpdata/bond_order_system.py
dpdata/formats/__init__.py
dpdata/formats/abacus/__init__.py
dpdata/formats/abacus/md.py
dpdata/formats/abacus/relax.py
dpdata/formats/abacus/scf.py
dpdata/formats/abacus/stru.py
dpdata/formats/amber/__init__.py
dpdata/formats/amber/mask.py
dpdata/formats/amber/md.py
dpdata/formats/amber/sqm.py
dpdata/formats/cp2k/__init__.py
dpdata/formats/cp2k/cell.py
dpdata/formats/cp2k/output.py
dpdata/formats/deepmd/__init__.py
dpdata/formats/deepmd/comp.py
dpdata/formats/deepmd/hdf5.py
dpdata/formats/deepmd/mixed.py
dpdata/formats/deepmd/raw.py
dpdata/formats/dftbplus/__init__.py
dpdata/formats/dftbplus/output.py
dpdata/formats/fhi_aims/__init__.py
dpdata/formats/fhi_aims/output.py
dpdata/formats/gaussian/__init__.py
dpdata/formats/gaussian/fchk.py
dpdata/formats/gaussian/gjf.py
dpdata/formats/gaussian/log.py
dpdata/formats/gromacs/__init__.py
dpdata/formats/gromacs/gro.py
dpdata/formats/lammps/__init__.py
dpdata/formats/lammps/dump.py
dpdata/formats/lammps/lmp.py
dpdata/formats/lmdb/__init__.py
dpdata/formats/lmdb/format.py
dpdata/formats/md/__init__.py
dpdata/formats/md/msd.py
dpdata/formats/md/pbc.py
dpdata/formats/md/rdf.py
dpdata/formats/md/water.py
dpdata/formats/openmx/__init__.py
dpdata/formats/openmx/omx.py
dpdata/formats/orca/__init__.py
dpdata/formats/orca/output.py
dpdata/formats/psi4/__init__.py
dpdata/formats/psi4/input.py
dpdata/formats/psi4/output.py
dpdata/formats/pwmat/__init__.py
dpdata/formats/pwmat/atomconfig.py
dpdata/formats/pwmat/movement.py
dpdata/formats/pymatgen/__init__.py
dpdata/formats/pymatgen/molecule.py
dpdata/formats/pymatgen/structure.py
dpdata/formats/qe/__init__.py
dpdata/formats/qe/scf.py
dpdata/formats/qe/traj.py
dpdata/formats/rdkit/__init__.py
dpdata/formats/rdkit/sanitize.py
dpdata/formats/rdkit/utils.py
dpdata/formats/siesta/__init__.py
dpdata/formats/siesta/aiMD_output.py
dpdata/formats/siesta/output.py
dpdata/formats/vasp/__init__.py
dpdata/formats/vasp/outcar.py
dpdata/formats/vasp/poscar.py
dpdata/formats/vasp/xml.py
dpdata/formats/xyz/__init__.py
dpdata/formats/xyz/quip_gap_xyz.py
dpdata/formats/xyz/xyz.py
dpdata/plugins/3dmol.py
dpdata/plugins/abacus.py
dpdata/plugins/amber.py
dpdata/plugins/cp2k.py
dpdata/plugins/deepmd.py
dpdata/plugins/dftbplus.py
dpdata/plugins/fhi_aims.py
dpdata/plugins/gaussian.py
dpdata/plugins/gromacs.py
dpdata/plugins/lammps.py
dpdata/plugins/lmdb.py
dpdata/plugins/openmx.py
dpdata/plugins/orca.py
dpdata/plugins/psi4.py
dpdata/plugins/pwmat.py
dpdata/plugins/pymatgen.py
dpdata/plugins/qe.py
dpdata/plugins/rdkit.py
dpdata/plugins/siesta.py
dpdata/plugins/vasp.py
dpdata/plugins/xyz.py
dpdata/siesta/__init__.py
dpdata/system.py
dpdata/vasp/__init__.py
dpdata/xyz/__init__.py
tests/context.py
tests/test_abacus_stru_dump.py
tests/test_lammps_lmp_dump.py
tests/test_lammps_spin.py
tests/test_lmdb.py

Copilot

Pull request overview

This PR rebases the prior directory reorganization (moving format implementations under dpdata.formats) and updates internal imports/tests accordingly, with the stated goal of restoring backward-compatible access to historically public helper modules/namespaces after the move.

Changes:

Relocates/introduces many format implementation modules under dpdata/formats/** (e.g., VASP/LAMMPS/QE/Gaussian/CP2K/LMDB, etc.).
Updates plugin modules and some core code to import from dpdata.formats.* instead of historical locations.
Adds a lazy attribute-based loader in dpdata/__init__.py for some format modules (cp2k, gaussian, qe).
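
A lazy attribute-based loader of the kind described in the last item is usually built on a module-level `__getattr__` (PEP 562). The sketch below is a generic illustration, not the code from this PR: `demo_pkg`, `_LAZY_MODULES`, and the attribute names are invented, and standard-library modules stand in for format subpackages.

```python
import importlib
import types

# Hypothetical mapping: attribute name -> module imported on first access.
_LAZY_MODULES = {"json_backend": "json", "csv_backend": "csv"}


def _lazy_getattr(name: str):
    """Called only when normal attribute lookup on the module fails."""
    try:
        target = _LAZY_MODULES[name]
    except KeyError:
        raise AttributeError(f"module 'demo_pkg' has no attribute {name!r}") from None
    return importlib.import_module(target)


# Stand-in for a package; in a real package the function would simply be
# defined as `def __getattr__(name)` at the top level of __init__.py.
demo_pkg = types.ModuleType("demo_pkg")
demo_pkg.__getattr__ = _lazy_getattr

print(demo_pkg.json_backend.dumps([1, 2]))
print(hasattr(demo_pkg, "missing"))
```

The benefit of this pattern is that importing the package stays cheap, because the backing modules load only when an attribute is first accessed.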

Reviewed changes

Copilot reviewed 44 out of 99 changed files in this pull request and generated 4 comments.

File	Description
`dpdata/__init__.py`	Switches top-level exports to `dpdata.formats.*` and adds lazy `__getattr__` for some format modules.
`dpdata/system.py`	Updates internal imports to `dpdata.formats.md.pbc` and `dpdata.formats.amber.mask`.
`dpdata/bond_order_system.py`	Updates RDKit helpers import path to `dpdata.formats.rdkit.*`.
`dpdata/plugins/3dmol.py`	Updates XYZ helper import path to `dpdata.formats.xyz.xyz`.
`dpdata/plugins/abacus.py`	Points ABACUS plugin imports to `dpdata.formats.abacus.*`.
`dpdata/plugins/amber.py`	Points Amber plugin imports to `dpdata.formats.amber.*`.
`dpdata/plugins/cp2k.py`	Points CP2K plugin imports to `dpdata.formats.cp2k.output`.
`dpdata/plugins/deepmd.py`	Points DeePMD plugin imports to `dpdata.formats.deepmd.*`.
`dpdata/plugins/dftbplus.py`	Points DFTB+ plugin imports to `dpdata.formats.dftbplus.output`.
`dpdata/plugins/fhi_aims.py`	Points FHI-aims plugin imports to `dpdata.formats.fhi_aims.output`.
`dpdata/plugins/gaussian.py`	Points Gaussian plugin imports to `dpdata.formats.gaussian.*` and updates doc references.
`dpdata/plugins/gromacs.py`	Points Gromacs plugin imports to `dpdata.formats.gromacs.gro`.
`dpdata/plugins/lammps.py`	Points LAMMPS plugin imports to `dpdata.formats.lammps.*`.
`dpdata/plugins/lmdb.py`	Registers LMDB format via `dpdata.formats.lmdb.format.LMDBFormat`.
`dpdata/plugins/openmx.py`	Points OpenMX plugin imports to `dpdata.formats.openmx.*` and `dpdata.formats.md.pbc`.
`dpdata/plugins/orca.py`	Points ORCA plugin imports to `dpdata.formats.orca.output`.
`dpdata/plugins/psi4.py`	Points Psi4 plugin imports to `dpdata.formats.psi4.*`.
`dpdata/plugins/pwmat.py`	Points PWMAT plugin imports to `dpdata.formats.pwmat.*`.
`dpdata/plugins/pymatgen.py`	Points pymatgen plugin imports to `dpdata.formats.pymatgen.*`.
`dpdata/plugins/qe.py`	Points QE plugin imports to `dpdata.formats.qe.*` and `dpdata.formats.md.pbc`.
`dpdata/plugins/rdkit.py`	Points RDKit plugin imports to `dpdata.formats.rdkit.utils`.
`dpdata/plugins/siesta.py`	Points SIESTA plugin imports to `dpdata.formats.siesta.*`.
`dpdata/plugins/vasp.py`	Points VASP plugin imports to `dpdata.formats.vasp.*`.
`dpdata/plugins/xyz.py`	Points XYZ plugin imports to `dpdata.formats.xyz.*`.
`dpdata/formats/__init__.py`	Introduces the `dpdata.formats` package marker.
`dpdata/formats/abacus/md.py`	Adds/relocates ABACUS MD reader under formats.
`dpdata/formats/abacus/relax.py`	Adds/relocates ABACUS relax reader under formats.
`dpdata/formats/abacus/scf.py`	Adjusts imports for ABACUS SCF under formats.
`dpdata/formats/abacus/stru.py`	Adjusts imports for ABACUS STRU under formats.
`dpdata/formats/amber/mask.py`	Adds/relocates Amber mask utilities under formats.
`dpdata/formats/amber/md.py`	Adjusts imports for Amber MD under formats.
`dpdata/formats/amber/sqm.py`	Adds/relocates SQM parsing/input generation under formats.
`dpdata/formats/cp2k/cell.py`	Adds/relocates CP2K cell helper under formats.
`dpdata/formats/cp2k/output.py`	Adjusts imports for CP2K output reader under formats.
`dpdata/formats/deepmd/comp.py`	Adds/relocates deepmd/npy (“comp”) support under formats.
`dpdata/formats/deepmd/hdf5.py`	Adds/relocates deepmd/hdf5 support under formats.
`dpdata/formats/deepmd/mixed.py`	Adds/relocates deepmd mixed-type utilities under formats.
`dpdata/formats/deepmd/raw.py`	Adds/relocates deepmd/raw support under formats.
`dpdata/formats/dftbplus/output.py`	Adds/relocates DFTB+ output reader under formats.
`dpdata/formats/fhi_aims/output.py`	Adds/relocates FHI-aims output reader under formats.
`dpdata/formats/gaussian/fchk.py`	Adjusts relative imports for Gaussian fchk under formats.
`dpdata/formats/gaussian/gjf.py`	Adds/relocates Gaussian input generator/parser under formats.
`dpdata/formats/gaussian/log.py`	Adjusts relative imports for Gaussian log under formats.
`dpdata/formats/gromacs/gro.py`	Adjusts relative imports for Gromacs gro under formats.
`dpdata/formats/lammps/dump.py`	Adds/relocates LAMMPS dump parsing/writing under formats.
`dpdata/formats/lammps/lmp.py`	Adds/relocates LAMMPS data-file parsing/writing under formats.
`dpdata/formats/lmdb/format.py`	Adds/relocates LMDB format implementation under formats.
`dpdata/formats/md/msd.py`	Adds/relocates MSD implementation under formats.
`dpdata/formats/md/pbc.py`	Adds/relocates PBC utilities under formats.
`dpdata/formats/md/rdf.py`	Adds/relocates RDF implementation under formats.
`dpdata/formats/md/water.py`	Adds/relocates water analysis utilities under formats.
`dpdata/formats/openmx/omx.py`	Adjusts relative imports for OpenMX under formats.
`dpdata/formats/orca/output.py`	Adds/relocates ORCA output reader under formats.
`dpdata/formats/psi4/input.py`	Adds/relocates Psi4 input writer under formats.
`dpdata/formats/psi4/output.py`	Adds/relocates Psi4 output reader under formats.
`dpdata/formats/pwmat/atomconfig.py`	Adjusts relative imports for PWMAT atomconfig under formats.
`dpdata/formats/pwmat/movement.py`	Adjusts relative imports for PWMAT movement under formats.
`dpdata/formats/pymatgen/molecule.py`	Adds/relocates pymatgen Molecule conversion under formats.
`dpdata/formats/pymatgen/structure.py`	Adds/relocates pymatgen Structure conversion under formats.
`dpdata/formats/qe/scf.py`	Adds/relocates QE SCF parsing under formats.
`dpdata/formats/qe/traj.py`	Fixes relative imports within QE traj under formats.
`dpdata/formats/rdkit/utils.py`	Adds/relocates RDKit helper utilities under formats.
`dpdata/formats/siesta/aiMD_output.py`	Adds/relocates SIESTA aiMD output reader under formats.
`dpdata/formats/siesta/output.py`	Adds/relocates SIESTA output reader under formats.
`dpdata/formats/vasp/outcar.py`	Adds/relocates VASP OUTCAR parsing under formats.
`dpdata/formats/vasp/poscar.py`	Adds/relocates VASP POSCAR parsing/writing under formats.
`dpdata/formats/vasp/xml.py`	Adds/relocates VASP XML parsing under formats.
`dpdata/formats/xyz/quip_gap_xyz.py`	Adds/relocates QUIP/GAP XYZ support under formats.
`dpdata/formats/xyz/xyz.py`	Adds/relocates basic XYZ conversions under formats.
`tests/context.py`	Updates import smoke-loading to `dpdata.formats.*`.
`tests/test_abacus_stru_dump.py`	Updates ABACUS test imports to `dpdata.formats.abacus.*`.
`tests/test_lammps_lmp_dump.py`	Updates LAMMPS test imports to `dpdata.formats.lammps.*`.
`tests/test_lammps_spin.py`	Updates LAMMPS test imports to `dpdata.formats.lammps.*`.
`tests/test_lmdb.py`	Updates LMDB test imports to `dpdata.formats.lmdb.*`.

wanghan-iapcm · 2026-05-06T04:45:01Z

conflicts should be resolved.

Move all format directories (abacus, amber, cp2k, deepmd, dftbplus, fhi_aims, gaussian, gromacs, lammps, lmdb, md, openmx, orca, psi4, pwmat, pymatgen, qe, rdkit, siesta, vasp, xyz) into a new formats/ subdirectory. This addresses issue deepmodeling#934.

Changes:
- Created dpdata/formats/ directory
- Moved all format directories to dpdata/formats/
- Updated all import statements throughout the codebase
- Updated relative imports in format modules (from .. to from ...)
- Updated dpdata/__init__.py to import from new locations
- Updated tests/context.py for new import paths

The plugins directory remains at the root level as requested.

for more information, see https://pre-commit.ci

Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

dpdata/plugins/xyz.py (1)
1-117: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Run mandated Ruff checks before merge.

Please confirm ruff check dpdata/ and ruff format dpdata/ were run for this PR branch before merging.

As per coding guidelines, "Run ruff linting with ruff check dpdata/ and format code with ruff format dpdata/ before committing".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dpdata/plugins/xyz.py` around lines 1 - 117, Run the repository
linter/formatter on the dpdata package and commit any fixes: execute "ruff check
dpdata/" to identify issues and "ruff format dpdata/" to auto-format, then
re-run "ruff check dpdata/" to confirm there are no remaining offenses; ensure
changes (especially in the modified symbols/classes like XYZFormat and
QuipGapXYZFormat in dpdata/plugins/xyz.py and any related imports such as
coord_to_xyz, xyz_to_coord, QuipGapxyzSystems, format_single_frame) are staged
and committed before merging.

♻️ Duplicate comments (1)

dpdata/plugins/openmx.py (1)
64-64: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove unused unpacked variable to satisfy lint.

The variable cs is assigned but never used, triggering Ruff RUF059. This should use _ instead, matching the pattern already used on Line 38.
🔧 Proposed fix
-        data, cs = dpdata.formats.openmx.omx.to_system_data(fname, mdname)
+        data, _ = dpdata.formats.openmx.omx.to_system_data(fname, mdname)
As per coding guidelines, dpdata/**/*.py: Run ruff linting with ruff check dpdata/ and format code with ruff format dpdata/ before committing.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dpdata/plugins/openmx.py` at line 64, The assignment unpacks two values from
dpdata.formats.openmx.omx.to_system_data(fname, mdname) into variables `data,
cs` but `cs` is unused and triggers RUF059; change the unused unpacked variable
`cs` to `_` (i.e., `data, _ = dpdata.formats.openmx.omx.to_system_data(fname,
mdname)`) in the dpdata.plugins.openmx module and then run ruff check/format on
the dpdata/ package as per guidelines.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b2ea20e0-3772-4448-8e73-4cff8fb7f254

📥 Commits

Reviewing files that changed from the base of the PR and between 2482d02 and d6252fc.

📒 Files selected for processing (100)

dpdata/__init__.py
dpdata/bond_order_system.py
dpdata/formats/__init__.py
dpdata/formats/abacus/__init__.py
dpdata/formats/abacus/md.py
dpdata/formats/abacus/relax.py
dpdata/formats/abacus/scf.py
dpdata/formats/abacus/stru.py
dpdata/formats/amber/__init__.py
dpdata/formats/amber/mask.py
dpdata/formats/amber/md.py
dpdata/formats/amber/sqm.py
dpdata/formats/cp2k/__init__.py
dpdata/formats/cp2k/cell.py
dpdata/formats/cp2k/output.py
dpdata/formats/deepmd/__init__.py
dpdata/formats/deepmd/comp.py
dpdata/formats/deepmd/hdf5.py
dpdata/formats/deepmd/mixed.py
dpdata/formats/deepmd/raw.py
dpdata/formats/dftbplus/__init__.py
dpdata/formats/dftbplus/output.py
dpdata/formats/fhi_aims/__init__.py
dpdata/formats/fhi_aims/output.py
dpdata/formats/gaussian/__init__.py
dpdata/formats/gaussian/fchk.py
dpdata/formats/gaussian/gjf.py
dpdata/formats/gaussian/log.py
dpdata/formats/gromacs/__init__.py
dpdata/formats/gromacs/gro.py
dpdata/formats/lammps/__init__.py
dpdata/formats/lammps/dump.py
dpdata/formats/lammps/lmp.py
dpdata/formats/lmdb/__init__.py
dpdata/formats/lmdb/format.py
dpdata/formats/openmx/__init__.py
dpdata/formats/openmx/omx.py
dpdata/formats/orca/__init__.py
dpdata/formats/orca/output.py
dpdata/formats/psi4/__init__.py
dpdata/formats/psi4/input.py
dpdata/formats/psi4/output.py
dpdata/formats/pwmat/__init__.py
dpdata/formats/pwmat/atomconfig.py
dpdata/formats/pwmat/movement.py
dpdata/formats/pymatgen/__init__.py
dpdata/formats/pymatgen/molecule.py
dpdata/formats/pymatgen/structure.py
dpdata/formats/qe/__init__.py
dpdata/formats/qe/scf.py
dpdata/formats/qe/traj.py
dpdata/formats/rdkit/__init__.py
dpdata/formats/rdkit/sanitize.py
dpdata/formats/rdkit/utils.py
dpdata/formats/siesta/__init__.py
dpdata/formats/siesta/aiMD_output.py
dpdata/formats/siesta/output.py
dpdata/formats/vasp/__init__.py
dpdata/formats/vasp/outcar.py
dpdata/formats/vasp/poscar.py
dpdata/formats/vasp/xml.py
dpdata/formats/xyz/__init__.py
dpdata/formats/xyz/quip_gap_xyz.py
dpdata/formats/xyz/xyz.py
dpdata/plugins/3dmol.py
dpdata/plugins/abacus.py
dpdata/plugins/amber.py
dpdata/plugins/cp2k.py
dpdata/plugins/deepmd.py
dpdata/plugins/dftbplus.py
dpdata/plugins/fhi_aims.py
dpdata/plugins/gaussian.py
dpdata/plugins/gromacs.py
dpdata/plugins/lammps.py
dpdata/plugins/lmdb.py
dpdata/plugins/openmx.py
dpdata/plugins/orca.py
dpdata/plugins/psi4.py
dpdata/plugins/pwmat.py
dpdata/plugins/pymatgen.py
dpdata/plugins/qe.py
dpdata/plugins/rdkit.py
dpdata/plugins/siesta.py
dpdata/plugins/vasp.py
dpdata/plugins/xyz.py
dpdata/siesta/__init__.py
dpdata/system.py
dpdata/vasp/__init__.py
dpdata/xyz/__init__.py
tests/context.py
tests/test_abacus_stru_dump.py
tests/test_amber_md.py
tests/test_cell_to_low_triangle.py
tests/test_gaussian_driver.py
tests/test_lammps_lmp_dump.py
tests/test_lammps_spin.py
tests/test_lmdb.py
tests/test_msd.py
tests/test_qe_cp_traj.py
tests/test_water_ions.py
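
For downstream code that imported backend helpers directly, the layout in the list above suggests a mechanical path rewrite. The sketch below is a hypothetical helper, not part of dpdata: it assumes every backend was previously importable as `dpdata.<format>[.<module>]`, which this thread does not confirm for each package, and it takes the backend names from the `dpdata/formats/` entries listed above.

```python
# Backend names taken from the dpdata/formats/ entries in the file list above.
FORMAT_PACKAGES = frozenset({
    "abacus", "amber", "cp2k", "deepmd", "dftbplus", "fhi_aims", "gaussian",
    "gromacs", "lammps", "lmdb", "openmx", "orca", "psi4", "pwmat",
    "pymatgen", "qe", "rdkit", "siesta", "vasp", "xyz",
})


def new_import_path(old_path: str) -> str:
    """Hypothetical helper: map an assumed pre-refactor module path to its new location.

    Assumes the old layout was ``dpdata.<format>[.<module>]``. Any other path
    (for example ``dpdata.md``, ``dpdata.plugins`` or ``dpdata.system``) is
    returned unchanged.
    """
    parts = old_path.split(".")
    if len(parts) >= 2 and parts[0] == "dpdata" and parts[1] in FORMAT_PACKAGES:
        # Insert the new "formats" package between "dpdata" and the backend name.
        return ".".join(["dpdata", "formats", *parts[1:]])
    return old_path
```

Paths that already start with `dpdata.formats` fall through unchanged, so the helper can be applied repeatedly without double-prefixing.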

✅ Files skipped from review due to trivial changes (20)

dpdata/formats/pwmat/movement.py
dpdata/formats/__init__.py
dpdata/formats/gromacs/gro.py
dpdata/formats/abacus/scf.py
dpdata/plugins/lmdb.py
dpdata/plugins/3dmol.py
tests/test_lmdb.py
dpdata/formats/gaussian/log.py
tests/test_amber_md.py
dpdata/formats/cp2k/output.py
dpdata/plugins/dftbplus.py
dpdata/formats/qe/__init__.py
dpdata/plugins/psi4.py
dpdata/formats/gaussian/__init__.py
dpdata/formats/cp2k/__init__.py
tests/test_qe_cp_traj.py
dpdata/plugins/cp2k.py
dpdata/plugins/lammps.py
dpdata/plugins/rdkit.py
tests/test_water_ions.py

🚧 Files skipped from review as they are similar to previous changes (21)

dpdata/formats/pwmat/atomconfig.py
tests/test_lammps_lmp_dump.py
dpdata/formats/amber/md.py
dpdata/formats/abacus/stru.py
dpdata/formats/qe/traj.py
tests/test_msd.py
dpdata/formats/gaussian/fchk.py
dpdata/plugins/orca.py
tests/test_cell_to_low_triangle.py
tests/context.py
dpdata/plugins/fhi_aims.py
dpdata/plugins/gromacs.py
dpdata/plugins/siesta.py
tests/test_abacus_stru_dump.py
dpdata/plugins/vasp.py
dpdata/formats/openmx/omx.py
dpdata/system.py
dpdata/plugins/deepmd.py
dpdata/plugins/gaussian.py
dpdata/plugins/amber.py
dpdata/plugins/qe.py

dosubot Bot added the size:M (This PR changes 30-99 lines, ignoring generated files.), deepmd (DeePMD-kit format), and dpdata labels May 5, 2026

coderabbitai Bot reviewed May 5, 2026


Comment thread dpdata/__init__.py Outdated

Comment thread dpdata/plugins/openmx.py

njzjz-bot force-pushed the oc-fix-pr-946-ci branch 4 times, most recently from 4c16ae8 to fda1705 May 5, 2026 08:18

njzjz requested a review from Copilot May 5, 2026 08:57

Copilot started reviewing on behalf of njzjz May 5, 2026 08:58

Copilot AI reviewed May 5, 2026


Comment thread dpdata/__init__.py Outdated

Comment thread tests/context.py Outdated

Comment thread tests/test_lammps_spin.py

Comment thread tests/test_lammps_lmp_dump.py

njzjz-bot force-pushed the oc-fix-pr-946-ci branch 3 times, most recently from 92a9266 to e23e05e May 5, 2026 09:09

njzjz-bot changed the title ~~fix: expose moved format helper modules~~ Fix CI after moving format modules May 5, 2026

njzjz-bot force-pushed the oc-fix-pr-946-ci branch 2 times, most recently from 2482d02 to ae36482 May 5, 2026 11:24

njzjz linked an issue May 5, 2026 that may be closed by this pull request

[Feature Request] Reorganize the directory structure #934

Open

njzjz-bot changed the title ~~Fix CI after moving format modules~~ Refactor format backends into dpdata.formats May 5, 2026

njzjz requested a review from wanghan-iapcm May 5, 2026 16:01

njzjz changed the title ~~Refactor format backends into dpdata.formats~~ refact: refactor format backends into dpdata.formats May 5, 2026

OpenClaw Bot and others added 3 commits May 6, 2026 10:21

[pre-commit.ci] auto fixes from pre-commit.com hooks

d037542

for more information, see https://pre-commit.ci

fix: update helper imports after format move

802d4a0

njzjz-bot force-pushed the oc-fix-pr-946-ci branch from ae36482 to 9680f7c May 6, 2026 10:31

dosubot Bot removed the size:M (This PR changes 30-99 lines, ignoring generated files.) label May 6, 2026

dosubot Bot added the size:XXL (This PR changes 1000+ lines, ignoring generated files.) label May 6, 2026

njzjz marked this pull request as draft May 7, 2026 04:16

test: update AMBER helper import after format move

d6252fc

Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)

njzjz-bot force-pushed the oc-fix-pr-946-ci branch from cfdb266 to d6252fc May 7, 2026 05:44

coderabbitai Bot reviewed May 7, 2026


wanghan-iapcm commented May 6, 2026


3 participants

njzjz-bot commented May 5, 2026 (edited by coderabbitai Bot)

codspeed-hq Bot commented May 5, 2026 (edited)

codecov Bot commented May 5, 2026 (edited)

coderabbitai Bot commented May 5, 2026 (edited)